Overview

Dataset statistics

Number of variables: 18
Number of observations: 65280
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.5 MiB
Average record size in memory: 265.7 B
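
The average record size is the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 MiB = 1,048,576 bytes and that the reported total has been rounded to one decimal place:

```python
# Figures copied from the dataset statistics above.
TOTAL_MIB = 16.5            # total size in memory, rounded to 0.1 MiB
N_OBSERVATIONS = 65280
BYTES_PER_MIB = 1024 ** 2   # assumption: binary mebibytes


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Average number of bytes per row for a table of the given size."""
    return total_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size(TOTAL_MIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B per record")
```

The result is about 265.0 B, slightly below the reported 265.7 B. The gap is consistent with the total having been rounded down to 16.5 MiB before display.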

Variable types

Numeric: 13
Categorical: 4
DateTime: 1

Alerts

Item has a high cardinality: 657 distinct values [High cardinality]
df_index is highly correlated with Invoice_Quarter and 1 other field [High correlation]
Invoice_Quarter is highly correlated with df_index and 1 other field [High correlation]
Invoice_Month is highly correlated with df_index and 1 other field [High correlation]
Sales Quantity is highly correlated with List Price and 1 other field [High correlation]
Sales Amount is highly correlated with Sales Amount Based on List Price and 3 other fields [High correlation]
Sales Amount Based on List Price is highly correlated with Sales Amount and 3 other fields [High correlation]
Discount Amount is highly correlated with Sales Amount and 3 other fields [High correlation]
Sales Margin Amount is highly correlated with Sales Amount and 3 other fields [High correlation]
Sales Cost Amount is highly correlated with Sales Amount and 3 other fields [High correlation]
List Price is highly correlated with Sales Quantity and 1 other field [High correlation]
Sales Price is highly correlated with Sales Quantity and 1 other field [High correlation]
Sales Quantity is highly correlated with Sales Amount and 3 other fields [High correlation]
Sales Amount is highly correlated with Sales Quantity and 3 other fields [High correlation]
Sales Amount Based on List Price is highly correlated with Sales Quantity and 4 other fields [High correlation]
Discount Amount is highly correlated with Sales Amount Based on List Price [High correlation]
Sales Margin Amount is highly correlated with Sales Quantity and 3 other fields [High correlation]
Sales Cost Amount is highly correlated with Sales Quantity and 3 other fields [High correlation]
List Price is highly correlated with Sales Price [High correlation]
Sales Price is highly correlated with List Price [High correlation]
df_index is highly correlated with Invoice_Year and 2 other fields [High correlation]
CustKey is highly correlated with Sales Rep [High correlation]
Invoice_Year is highly correlated with df_index and 1 other field [High correlation]
Invoice_Month is highly correlated with df_index and 2 other fields [High correlation]
Sales Amount is highly correlated with Sales Quantity and 4 other fields [High correlation]
Sales Margin Amount is highly correlated with Sales Quantity and 4 other fields [High correlation]
Sales Cost Amount is highly correlated with Sales Quantity and 4 other fields [High correlation]
Sales Rep is highly correlated with CustKey [High correlation]
Sales Quantity is highly skewed (γ1 = 23.00722057) [Skewed]
Sales Cost Amount is highly skewed (γ1 = 21.01063149) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
Discount Amount has 1214 (1.9%) zeros [Zeros]
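
The "Skewed" alerts report a skewness coefficient γ1. A minimal sketch of how such a coefficient can be computed, assuming the bias-adjusted Fisher-Pearson form (the report does not state which estimator it uses) and using hypothetical data rather than the actual column:

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient (often written G1)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5
    # Small-sample bias adjustment.
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


symmetric = [1, 2, 3, 4, 5]
long_tailed = [1, 1, 2, 2, 3, 3, 8, 1000]   # hypothetical quantities

print(sample_skewness(symmetric))
print(sample_skewness(long_tailed))
```

A symmetric sample gives a value of zero, while a single very large value pulls the coefficient strongly positive, which is the pattern the alerts flag for Sales Quantity and Sales Cost Amount.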

Reproduction

Analysis started: 2022-10-06 03:19:54.147763
Analysis finished: 2022-10-06 03:20:20.926352
Duration: 26.78 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
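
The reported duration is the difference between the two timestamps. A short check using only the standard library (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"   # assumed layout of the timestamps above

started = datetime.strptime("2022-10-06 03:19:54.147763", FMT)
finished = datetime.strptime("2022-10-06 03:20:20.926352", FMT)

# Subtracting two datetimes yields a timedelta.
seconds = (finished - started).total_seconds()
print(f"{seconds:.2f} seconds")
```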

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 65280
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32640.96425
Minimum: 0
Maximum: 65281
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3264.95
Q1: 16320.75
Median: 32640.5
Q3: 48961.25
95-th percentile: 62017.05
Maximum: 65281
Range: 65281
Interquartile range (IQR): 32640.5

Descriptive statistics

Standard deviation: 18845.29037
Coefficient of variation (CV): 0.5773509086
Kurtosis: -1.200026217
Mean: 32640.96425
Median Absolute Deviation (MAD): 16320.5
Skewness: 5.038305807 × 10^-6
Sum: 2130802146
Variance: 355144969
Monotonicity: Strictly increasing
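
The coefficient of variation is the standard deviation divided by the mean. For df_index the reported values also sit very close to what a continuous uniform distribution starting at zero would give (CV of 1/√3 and excess kurtosis of -6/5), which fits the "Uniform" alert. A small check of that arithmetic, using the figures above:

```python
import math

# Figures copied from the df_index descriptive statistics.
mean = 32640.96425
std = 18845.29037
kurtosis = -1.200026217

cv = std / mean                     # coefficient of variation

# Theoretical values for a continuous uniform distribution on [0, b].
uniform_cv = 1 / math.sqrt(3)
uniform_excess_kurtosis = -6 / 5

print(f"CV = {cv:.10f}, uniform CV = {uniform_cv:.10f}")
print(f"kurtosis gap = {abs(kurtosis - uniform_excess_kurtosis):.6f}")
```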
Histogram with fixed size bins (bins=50)
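
A histogram with fixed size bins splits the range between the minimum and maximum into equally wide intervals and counts the values in each. The sketch below is one simple way to do that in plain Python; the function name is hypothetical and the profiler's exact binning rules may differ:

```python
def fixed_bin_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = fixed_bin_histogram(list(range(1000)), bins=50)
print(len(counts), sum(counts))
```

For a strictly increasing, gap-free index such as df_index, every bin ends up with roughly the same count, which is why its histogram is flat.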
Common values

Value                   Count   Frequency (%)
0                       1       < 0.1%
43527                   1       < 0.1%
43514                   1       < 0.1%
43515                   1       < 0.1%
43516                   1       < 0.1%
43517                   1       < 0.1%
43518                   1       < 0.1%
43519                   1       < 0.1%
43520                   1       < 0.1%
43521                   1       < 0.1%
Other values (65270)    65270   > 99.9%

Minimum values

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Maximum values

Value   Count   Frequency (%)
65281   1       < 0.1%
65280   1       < 0.1%
65279   1       < 0.1%
65278   1       < 0.1%
65277   1       < 0.1%
65276   1       < 0.1%
65275   1       < 0.1%
65274   1       < 0.1%
65273   1       < 0.1%
65272   1       < 0.1%

CustKey
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 615
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10017702.67
Minimum: 10000453
Maximum: 10027583
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 10000453
5-th percentile: 10002506
Q1: 10012715
Median: 10019665
Q3: 10023511
95-th percentile: 10026006.15
Maximum: 10027583
Range: 27130
Interquartile range (IQR): 10796

Descriptive statistics

Standard deviation: 7176.243993
Coefficient of variation (CV): 0.0007163562571
Kurtosis: -0.3714057462
Mean: 10017702.67
Median Absolute Deviation (MAD): 4886
Skewness: -0.7701959384
Sum: 6.539556306 × 10^11
Variance: 51498477.84
Monotonicity: Not monotonic
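
The Monotonicity entry describes whether the values, read in row order, only ever rise or only ever fall. A minimal sketch of such a classification; only "Strictly increasing" and "Not monotonic" appear in this report, so the other labels and the function name are assumptions:

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"          # assumed label: ties allowed
    if all(a >= b for a, b in pairs):
        return "Decreasing"          # assumed label: ties allowed
    return "Not monotonic"


print(monotonicity([0, 1, 2, 3]))                      # like df_index
print(monotonicity([10025919, 10012715, 10019194]))    # keys in hypothetical row order
```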
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
10025919              2760    4.2%
10019194              2752    4.2%
10012715              1431    2.2%
10012226              1389    2.1%
10025025              1143    1.8%
10023524              1042    1.6%
10020515              1010    1.5%
10017638              792     1.2%
10022456              741     1.1%
10002506              714     1.1%
Other values (605)    51506   78.9%
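
A common-values table lists the most frequent values and folds everything else into a single "Other values (k)" row, where k is the number of remaining distinct values. A minimal sketch of that aggregation with `collections.Counter`; the function name and the `top_n` parameter are hypothetical:

```python
from collections import Counter


def common_values(values, top_n=10):
    """Return (label, count, percent) rows for the top values plus a remainder row."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    rows = [(value, count, 100 * count / total) for value, count in top]
    other_distinct = len(counts) - len(top)
    if other_distinct > 0:
        other_count = total - sum(count for _, count in top)
        rows.append((f"Other values ({other_distinct})",
                     other_count,
                     100 * other_count / total))
    return rows


rows = common_values(["a"] * 5 + ["b"] * 3 + ["c", "d"], top_n=2)
for label, count, percent in rows:
    print(label, count, f"{percent:.1f}%")
```

In the CustKey table, the ten listed counts add up to 13774, which leaves 65280 - 13774 = 51506 rows for the 605 remaining keys, matching the last row.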
Minimum values

Value      Count   Frequency (%)
10000453   329     0.5%
10000455   19      < 0.1%
10000456   104     0.2%
10000457   19      < 0.1%
10000458   10      < 0.1%
10000460   120     0.2%
10000461   251     0.4%
10000462   3       < 0.1%
10000466   123     0.2%
10000469   162     0.2%

Maximum values

Value      Count   Frequency (%)
10027583   25      < 0.1%
10027575   5       < 0.1%
10027572   52      0.1%
10027560   42      0.1%
10027381   108     0.2%
10027370   235     0.4%
10027356   21      < 0.1%
10027348   14      < 0.1%
10027340   35      0.1%
10027119   176     0.3%

Item
Categorical

HIGH CARDINALITY

Distinct: 657
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Better Fancy Canned Sardines: 1648
Ebony Prepared Salad: 1471
Moms Sliced Turkey: 1192
Imagine Popsicles: 1191
Discover Manicotti: 1126
Other values (652): 58652

Length

Max length: 37
Median length: 32
Mean length: 21.72234988
Min length: 8
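
The length statistics summarise the number of characters in each category value. A minimal sketch using the five sample rows listed in this report (so the numbers describe only that sample, not the full column; the function name is hypothetical):

```python
from statistics import mean, median


def length_stats(strings):
    """Max, median, mean and min character length of a list of strings."""
    lengths = [len(s) for s in strings]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows shown for the Item variable.
sample_items = [
    "Urban Large Eggs",
    "Moms Sliced Turkey",
    "Cutting Edge Foot-Long Hot Dogs",
    "Kiwi Lox",
    "High Top Sweet Onion",
]

stats = length_stats(sample_items)
print(stats)
```

"Kiwi Lox" has eight characters, which equals the minimum length reported for the whole column.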

Characters and Unicode

Total characters: 1418035
Distinct characters: 56
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
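
Python's standard `unicodedata` module exposes the general category of each code point, which is enough to reproduce a category breakdown like the one in this report. A minimal sketch; the mapping below covers only the six categories that occur for Item, and the example string is hypothetical:

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general categories mapped to the names used in the report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(text):
    """Count characters in `text` by Unicode general category."""
    counts = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in counts.items()}


result = category_counts("Foot-Long 1% Milk")   # hypothetical item name
print(result)
```

Script and block information is not available from `unicodedata`, so those two breakdowns would need another source.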

Unique

Unique: 30
Unique (%): < 0.1%

Sample

1st row: Urban Large Eggs
2nd row: Moms Sliced Turkey
3rd row: Cutting Edge Foot-Long Hot Dogs
4th row: Kiwi Lox
5th row: High Top Sweet Onion

Common Values

Value                              Count   Frequency (%)
Better Fancy Canned Sardines       1648    2.5%
Ebony Prepared Salad               1471    2.3%
Moms Sliced Turkey                 1192    1.8%
Imagine Popsicles                  1191    1.8%
Discover Manicotti                 1126    1.7%
Red Spade Foot-Long Hot Dogs       1075    1.6%
High Top Dried Mushrooms           1073    1.6%
Big Time Frozen Cheese Pizza       947     1.5%
Cutting Edge Foot-Long Hot Dogs    942     1.4%
Bravo Large Canned Shrimp          941     1.4%
Other values (647)                 53674   82.2%

Length

Histogram of lengths of the category
Value                 Count    Frequency (%)
canned                6378     2.8%
ebony                 5460     2.4%
cheese                5194     2.3%
better                4570     2.0%
red                   4271     1.9%
top                   4173     1.8%
spade                 4161     1.8%
high                  4138     1.8%
best                  3480     1.5%
nationeel             3328     1.4%
Other values (294)    184532   80.3%
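
The table above counts lowercased words across all Item values. A minimal sketch of that kind of count, assuming plain whitespace tokenisation (the profiler's own tokeniser may split differently, for example on hyphens); the item names are taken from this report:

```python
from collections import Counter


def word_frequencies(strings):
    """Count lowercased, whitespace-separated words over all strings."""
    words = Counter()
    for s in strings:
        words.update(s.lower().split())
    return words


items = [
    "Better Fancy Canned Sardines",
    "Bravo Large Canned Shrimp",
    "Red Spade Foot-Long Hot Dogs",
    "Cutting Edge Foot-Long Hot Dogs",
]

freq = word_frequencies(items)
print(freq.most_common(3))
```

Because each row contributes its words once, an item that appears in many invoices raises the counts of all of its words, which is why brand words such as "ebony" rank so high.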

Most occurring characters

ValueCountFrequency (%)
164405
 
11.6%
e147096
 
10.4%
o92465
 
6.5%
a92274
 
6.5%
n74072
 
5.2%
i69458
 
4.9%
t68731
 
4.8%
r67351
 
4.7%
l59835
 
4.2%
s57796
 
4.1%
Other values (46)524552
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1013733
71.5%
Uppercase Letter236241
 
16.7%
Space Separator164405
 
11.6%
Dash Punctuation2160
 
0.2%
Other Punctuation748
 
0.1%
Decimal Number748
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e147096
14.5%
o92465
 
9.1%
a92274
 
9.1%
n74072
 
7.3%
i69458
 
6.9%
t68731
 
6.8%
r67351
 
6.6%
l59835
 
5.9%
s57796
 
5.7%
d40545
 
4.0%
Other values (16)244110
24.1%
Uppercase Letter
ValueCountFrequency (%)
B31428
13.3%
C30873
13.1%
S26130
11.1%
T19374
 
8.2%
F16239
 
6.9%
M12520
 
5.3%
P11469
 
4.9%
L11022
 
4.7%
D10754
 
4.6%
E10237
 
4.3%
Other values (15)56195
23.8%
Decimal Number
ValueCountFrequency (%)
1579
77.4%
2169
 
22.6%
Space Separator
ValueCountFrequency (%)
164405
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2160
100.0%
Other Punctuation
ValueCountFrequency (%)
%748
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1249974
88.1%
Common168061
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e147096
 
11.8%
o92465
 
7.4%
a92274
 
7.4%
n74072
 
5.9%
i69458
 
5.6%
t68731
 
5.5%
r67351
 
5.4%
l59835
 
4.8%
s57796
 
4.6%
d40545
 
3.2%
Other values (41)480351
38.4%
Common
ValueCountFrequency (%)
164405
97.8%
-2160
 
1.3%
%748
 
0.4%
1579
 
0.3%
2169
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1418035
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
164405
 
11.6%
e147096
 
10.4%
o92465
 
6.5%
a92274
 
6.5%
n74072
 
5.2%
i69458
 
4.9%
t68731
 
4.8%
r67351
 
4.7%
l59835
 
4.2%
s57796
 
4.1%
Other values (46)524552
37.0%

DateTime

Distinct: 559
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 510.1 KiB
Minimum: 2017-01-01 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)
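
The Invoice_Year, Invoice_Quarter, Invoice_Month and Invoice_Day variables that follow look like calendar parts of a date. The report does not say how they were produced, so the sketch below is only an assumption about one way such columns could be derived from a date value:

```python
from datetime import date


def invoice_parts(d: date) -> dict:
    """Split a date into year, quarter, month and day (hypothetical helper)."""
    return {
        "Invoice_Year": d.year,
        "Invoice_Quarter": (d.month - 1) // 3 + 1,   # months 1-3 -> 1, ..., 10-12 -> 4
        "Invoice_Month": d.month,
        "Invoice_Day": d.day,
    }


# The minimum and maximum dates reported above.
print(invoice_parts(date(2017, 1, 1)))
print(invoice_parts(date(2019, 12, 31)))
```

If the columns were built this way, the quarter is a deterministic function of the month, which would explain why the two are flagged as highly correlated.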

Invoice_Year
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB

2017: 30573
2019: 28021
2018: 6686

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 261120
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value   Count   Frequency (%)
2017    30573   46.8%
2019    28021   42.9%
2018    6686    10.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
201730573
46.8%
201928021
42.9%
20186686
 
10.2%

Most occurring characters

ValueCountFrequency (%)
265280
25.0%
065280
25.0%
165280
25.0%
730573
11.7%
928021
10.7%
86686
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number261120
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
265280
25.0%
065280
25.0%
165280
25.0%
730573
11.7%
928021
10.7%
86686
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Common261120
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
265280
25.0%
065280
25.0%
165280
25.0%
730573
11.7%
928021
10.7%
86686
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII261120
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
265280
25.0%
065280
25.0%
165280
25.0%
730573
11.7%
928021
10.7%
86686
 
2.6%

Invoice_Quarter
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB

1: 19930
4: 16142
3: 14688
2: 14520

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 65280
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 4
4th row: 2
5th row: 2

Common Values

Value   Count   Frequency (%)
1       19930   30.5%
4       16142   24.7%
3       14688   22.5%
2       14520   22.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
119930
30.5%
416142
24.7%
314688
22.5%
214520
22.2%

Most occurring characters

ValueCountFrequency (%)
119930
30.5%
416142
24.7%
314688
22.5%
214520
22.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number65280
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
119930
30.5%
416142
24.7%
314688
22.5%
214520
22.2%

Most occurring scripts

ValueCountFrequency (%)
Common65280
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
119930
30.5%
416142
24.7%
314688
22.5%
214520
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII65280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
119930
30.5%
416142
24.7%
314688
22.5%
214520
22.2%

Invoice_Month
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.307000613
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.563557849
Coefficient of variation (CV): 0.5650162522
Kurtosis: -1.304080373
Mean: 6.307000613
Median Absolute Deviation (MAD): 3
Skewness: 0.07649377464
Sum: 411721
Variance: 12.69894454
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values

Value              Count   Frequency (%)
3                  7308    11.2%
2                  6556    10.0%
1                  6066    9.3%
12                 5645    8.6%
9                  5555    8.5%
6                  5376    8.2%
10                 5250    8.0%
11                 5247    8.0%
5                  5167    7.9%
8                  4737    7.3%
Other values (2)   8373    12.8%

Minimum values

Value   Count   Frequency (%)
1       6066    9.3%
2       6556    10.0%
3       7308    11.2%
4       3977    6.1%
5       5167    7.9%
6       5376    8.2%
7       4396    6.7%
8       4737    7.3%
9       5555    8.5%
10      5250    8.0%

Maximum values

Value   Count   Frequency (%)
12      5645    8.6%
11      5247    8.0%
10      5250    8.0%
9       5555    8.5%
8       4737    7.3%
7       4396    6.7%
6       5376    8.2%
5       5167    7.9%
4       3977    6.1%
3       7308    11.2%

Invoice_Day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.15589767
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 9
Median: 16
Q3: 24
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.795337539
Coefficient of variation (CV): 0.5444041376
Kurtosis: -1.224628553
Mean: 16.15589767
Median Absolute Deviation (MAD): 8
Skewness: -0.02103293504
Sum: 1054657
Variance: 77.35796243
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
112879
 
4.4%
292831
 
4.3%
102571
 
3.9%
182487
 
3.8%
262436
 
3.7%
232396
 
3.7%
212334
 
3.6%
302333
 
3.6%
92271
 
3.5%
52268
 
3.5%
Other values (21)40474
62.0%
ValueCountFrequency (%)
11850
2.8%
21671
2.6%
32001
3.1%
41701
2.6%
52268
3.5%
62115
3.2%
72185
3.3%
82245
3.4%
92271
3.5%
102571
3.9%
ValueCountFrequency (%)
311019
 
1.6%
302333
3.6%
292831
4.3%
282093
3.2%
272140
3.3%
262436
3.7%
252123
3.3%
242065
3.2%
232396
3.7%
222220
3.4%

Sales Quantity
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 279
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.08570772
Minimum: 1
Maximum: 16000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 8
95-th percentile: 86
Maximum: 16000
Range: 15999
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 429.6683008
Coefficient of variation (CV): 9.530033408
Kurtosis: 649.737599
Mean: 45.08570772
Median Absolute Deviation (MAD): 2
Skewness: 23.00722057
Sum: 2943195
Variance: 184614.8487
Monotonicity: Not monotonic
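
For Sales Quantity the median absolute deviation (2) is tiny compared with the standard deviation (about 430), because the MAD ignores how far the extreme values lie from the centre. A minimal sketch of the MAD on hypothetical quantities shaped like this column (mostly small, with a few very large orders):

```python
from statistics import median, pstdev


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = median(values)
    return median([abs(x - med) for x in values])


quantities = [1, 1, 2, 2, 3, 3, 8, 86, 16000]   # hypothetical sample

mad = median_absolute_deviation(quantities)
spread = pstdev(quantities)
print(f"MAD = {mad}, standard deviation = {spread:.1f}")
```

The single value of 16000 dominates the standard deviation but leaves the MAD untouched, so the MAD is the more useful spread measure for a heavily skewed variable like this one.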
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
1                     15264   23.4%
2                     13466   20.6%
3                     7056    10.8%
4                     4973    7.6%
5                     3519    5.4%
6                     3061    4.7%
10                    2596    4.0%
8                     1460    2.2%
12                    1314    2.0%
20                    1034    1.6%
Other values (269)    11537   17.7%

Minimum values

Value   Count   Frequency (%)
1       15264   23.4%
2       13466   20.6%
3       7056    10.8%
4       4973    7.6%
5       3519    5.4%
6       3061    4.7%
7       711     1.1%
8       1460    2.2%
9       453     0.7%
10      2596    4.0%

Maximum values

Value   Count   Frequency (%)
16000   11      < 0.1%
13600   12      < 0.1%
9504    7       < 0.1%
8316    40      0.1%
7128    21      < 0.1%
7126    2       < 0.1%
6480    2       < 0.1%
6400    4       < 0.1%
5834    4       < 0.1%
4752    13      < 0.1%

Sales Amount
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 17895
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2852.043002
Minimum: 200.01
Maximum: 555376
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 200.01
5-th percentile: 215.78
Q1: 308.38
Median: 553.94
Q3: 1279.9875
95-th percentile: 8777.79
Maximum: 555376
Range: 555175.99
Interquartile range (IQR): 971.6075

Descriptive statistics

Standard deviation: 15164.56904
Coefficient of variation (CV): 5.3170899
Kurtosis: 478.9000606
Mean: 2852.043002
Median Absolute Deviation (MAD): 292.92
Skewness: 18.57841346
Sum: 186181367.2
Variance: 229964154.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
784.97115
 
0.2%
817.68115
 
0.2%
294.72110
 
0.2%
307104
 
0.2%
597.14102
 
0.2%
622.02101
 
0.2%
824.39100
 
0.2%
791.4199
 
0.2%
401.1695
 
0.1%
204.6692
 
0.1%
Other values (17885)64247
98.4%
ValueCountFrequency (%)
200.016
< 0.1%
200.066
< 0.1%
200.081
 
< 0.1%
200.143
< 0.1%
200.155
< 0.1%
200.197
< 0.1%
200.211
 
< 0.1%
200.33
< 0.1%
200.361
 
< 0.1%
200.376
< 0.1%
ValueCountFrequency (%)
5553761
 
< 0.1%
5392005
< 0.1%
5176325
< 0.1%
472069.62
 
< 0.1%
4583205
< 0.1%
439987.25
< 0.1%
310156.071
 
< 0.1%
301122.42
 
< 0.1%
2972401
 
< 0.1%
289077.52
 
< 0.1%

Sales Amount Based on List Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4060
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4707.617837
Minimum: 0
Maximum: 632610.16
Zeros: 294
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 390
Q1: 561.04
Median: 998.16
Q3: 2316.63
95-th percentile: 16425.12
Maximum: 632610.16
Range: 632610.16
Interquartile range (IQR): 1755.59

Descriptive statistics

Standard deviation: 20696.74443
Coefficient of variation (CV): 4.396436827
Kurtosis: 278.7073688
Mean: 4707.617837
Median Absolute Deviation (MAD): 524.88
Skewness: 14.07462245
Sum: 307313292.4
Variance: 428355229.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1431.23590
 
0.9%
1627.84530
 
0.8%
803.86498
 
0.8%
596448
 
0.7%
1254.1899418
 
0.6%
966.44376
 
0.6%
439.7372
 
0.6%
507.75363
 
0.6%
767.75348
 
0.5%
939.57343
 
0.5%
Other values (4050)60994
93.4%
ValueCountFrequency (%)
0294
0.5%
1942
 
< 0.1%
195.611
 
< 0.1%
198.3961
 
< 0.1%
198.631
 
< 0.1%
200.78
 
< 0.1%
200.81
 
< 0.1%
201.693
 
< 0.1%
202.141
 
< 0.1%
202.61
 
< 0.1%
ValueCountFrequency (%)
632610.165
< 0.1%
624453.752
 
< 0.1%
53920011
< 0.1%
45832012
< 0.1%
391924.72325
< 0.1%
3873958
< 0.1%
348655.52
 
< 0.1%
332196.4052
 
< 0.1%
330708.37925
< 0.1%
310273.73922
 
< 0.1%

Discount Amount
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 17820
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1855.574835
Minimum: -255820.8
Maximum: 343532.66
Zeros: 1214
Zeros (%): 1.9%
Negative: 972
Negative (%): 1.5%
Memory size: 510.1 KiB

Quantile statistics

Minimum: -255820.8
5-th percentile: 18.68
Q1: 246.0375
Median: 441.76
Q3: 999.76
95-th percentile: 6353
Maximum: 343532.66
Range: 599353.46
Interquartile range (IQR): 753.7225

Descriptive statistics

Standard deviation: 9037.140888
Coefficient of variation (CV): 4.870264847
Kurtosis: 379.7363588
Mean: 1855.574835
Median Absolute Deviation (MAD): 233.935
Skewness: 10.84177856
Sum: 121131925.2
Variance: 81669915.44
Monotonicity: Not monotonic
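
The column sums reported for three variables line up: Sales Amount Based on List Price minus Sales Amount equals the Discount Amount sum to the precision shown. This suggests, but does not prove, that the discount is the difference between the two amounts on each row. A quick check of the totals (variable names are hypothetical):

```python
# Sums copied from the descriptive statistics of the three variables.
list_price_amount_sum = 307313292.4   # Sales Amount Based on List Price
sales_amount_sum = 186181367.2        # Sales Amount
discount_sum = 121131925.2            # Discount Amount

implied_discount = list_price_amount_sum - sales_amount_sum
residual = implied_discount - discount_sum

print(f"implied discount total = {implied_discount:.1f}")
print(f"difference from reported total = {residual:.2f}")
```

A relationship of this kind would also account for the high correlations flagged between these amount columns.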
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
01214
 
1.9%
24.88103
 
0.2%
606.84100
 
0.2%
639.8297
 
0.1%
601.903393
 
0.1%
402.793
 
0.1%
634.613393
 
0.1%
918.141288
 
0.1%
169.3687
 
0.1%
385.9887
 
0.1%
Other values (17810)63225
96.9%
ValueCountFrequency (%)
-255820.81
 
< 0.1%
-245587.971
 
< 0.1%
-238792.731
 
< 0.1%
-231837.63
< 0.1%
-222564.13
< 0.1%
-1271761
 
< 0.1%
-122088.961
 
< 0.1%
-84573.721
 
< 0.1%
-81190.771
 
< 0.1%
-536261
 
< 0.1%
ValueCountFrequency (%)
343532.662
< 0.1%
339103.351
 
< 0.1%
331487.762
< 0.1%
327213.751
 
< 0.1%
322454.091
 
< 0.1%
2103714
< 0.1%
2029954
< 0.1%
191196.55322
< 0.1%
189333.91
 
< 0.1%
182832.88322
< 0.1%

Sales Margin Amount
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 21295
Distinct (%): 32.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1191.012887
Minimum: -3932.93
Maximum: 188800
Zeros: 3
Zeros (%): < 0.1%
Negative: 576
Negative (%): 0.9%
Memory size: 510.1 KiB

Quantile statistics

Minimum: -3932.93
5-th percentile: 61.54
Q1: 129.9475
Median: 246.49
Q3: 579.39
95-th percentile: 3824.43
Maximum: 188800
Range: 192732.93
Interquartile range (IQR): 449.4425

Descriptive statistics

Standard deviation: 5860.857507
Coefficient of variation (CV): 4.920901841
Kurtosis: 324.9276471
Mean: 1191.012887
Median Absolute Deviation (MAD): 140.265
Skewness: 15.57141451
Sum: 77749321.25
Variance: 34349650.72
Monotonicity: Not monotonic
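
The reported totals are also consistent with the margin being Sales Amount minus Sales Cost Amount, within the rounding of the displayed sums. The sketch below checks that and derives an overall margin rate; both the identity and the rate are inferences from the totals, not statements made by the report:

```python
# Sums copied from the descriptive statistics of the three variables.
sales_amount_total = 186181367.2   # Sales Amount
sales_cost_total = 108432045.9     # Sales Cost Amount
margin_total = 77749321.25         # Sales Margin Amount

implied_margin = sales_amount_total - sales_cost_total
margin_gap = implied_margin - margin_total

# Share of total sales that remains as margin.
margin_rate = margin_total / sales_amount_total

print(f"implied margin total = {implied_margin:.2f}")
print(f"gap to reported total = {margin_gap:.2f}")
print(f"overall margin rate = {margin_rate:.1%}")
```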
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
374.793
 
0.1%
5317.1788
 
0.1%
6235.3187
 
0.1%
341.7284
 
0.1%
15.3269
 
0.1%
37.0867
 
0.1%
52.867
 
0.1%
431.8864
 
0.1%
464.5964
 
0.1%
24.5363
 
0.1%
Other values (21285)64534
98.9%
ValueCountFrequency (%)
-3932.931
< 0.1%
-3764.42
< 0.1%
-3673.682
< 0.1%
-3608.811
< 0.1%
-3414.012
< 0.1%
-3132.652
< 0.1%
-2533.972
< 0.1%
-2508.212
< 0.1%
-2488.891
< 0.1%
-2103.042
< 0.1%
ValueCountFrequency (%)
1888001
 
< 0.1%
185907.22
< 0.1%
1726243
< 0.1%
164339.22
< 0.1%
1604802
< 0.1%
156773.41
 
< 0.1%
156521.041
 
< 0.1%
1510563
< 0.1%
148401.63
< 0.1%
147487.372
< 0.1%

Sales Cost Amount
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 5513
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1661.030116
Minimum: 0
Maximum: 366576
Zeros: 347
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 85.36
Q1: 167.79
Median: 304.53
Q3: 687.4
95-th percentile: 4946.11
Maximum: 366576
Range: 366576
Interquartile range (IQR): 519.61

Descriptive statistics

Standard deviation: 9556.62722
Coefficient of variation (CV): 5.753434047
Kurtosis: 614.2579832
Mean: 1661.030116
Median Absolute Deviation (MAD): 171.22
Skewness: 21.01063149
Sum: 108432045.9
Variance: 91329123.83
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
449.69534
 
0.8%
475.75457
 
0.7%
0347
 
0.5%
134.67305
 
0.5%
162.89289
 
0.4%
205.72253
 
0.4%
159.14242
 
0.4%
16718.08234
 
0.4%
546.44231
 
0.4%
344.28229
 
0.4%
Other values (5503)62159
95.2%
ValueCountFrequency (%)
0347
0.5%
12.972
 
< 0.1%
19.554
 
< 0.1%
20.86
 
< 0.1%
261
 
< 0.1%
31.194
 
< 0.1%
33.973
 
< 0.1%
35.482
 
< 0.1%
35.545
 
< 0.1%
36.031
 
< 0.1%
ValueCountFrequency (%)
3665767
 
< 0.1%
353292.84
 
< 0.1%
311589.612
 
< 0.1%
185048.852
 
< 0.1%
161446.355
 
< 0.1%
157412.852
 
< 0.1%
153635.035
 
< 0.1%
146630.44
 
< 0.1%
141265.5636
0.1%
137736.244
 
< 0.1%

Sales Rep
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 64
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 137.4231924
Minimum: 103
Maximum: 185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 103
5-th percentile: 104
Q1: 113
Median: 134
Q3: 160
95-th percentile: 180
Maximum: 185
Range: 82
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 26.64392588
Coefficient of variation (CV): 0.1938823092
Kurtosis: -1.301510627
Mean: 137.4231924
Median Absolute Deviation (MAD): 23
Skewness: 0.3506954731
Sum: 8970986
Variance: 709.8987865
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1086225
 
9.5%
1804427
 
6.8%
1432926
 
4.5%
1172442
 
3.7%
1032162
 
3.3%
1042065
 
3.2%
1342033
 
3.1%
1151988
 
3.0%
1251967
 
3.0%
1571744
 
2.7%
Other values (54)37301
57.1%
ValueCountFrequency (%)
1032162
 
3.3%
1042065
 
3.2%
1051184
 
1.8%
1071304
 
2.0%
1086225
9.5%
1091137
 
1.7%
110594
 
0.9%
111542
 
0.8%
112486
 
0.7%
1131422
 
2.2%
ValueCountFrequency (%)
185538
 
0.8%
184229
 
0.4%
183326
 
0.5%
182808
 
1.2%
181792
 
1.2%
1804427
6.8%
179875
 
1.3%
1761083
 
1.7%
1751322
 
2.0%
173795
 
1.2%

U/M
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 MiB

EA: 58992
SE: 5629
PR: 659

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 130560
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EA
2nd row: EA
3rd row: EA
4th row: EA
5th row: SE

Common Values

Value   Count   Frequency (%)
EA      58992   90.4%
SE      5629    8.6%
PR      659     1.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
ea58992
90.4%
se5629
 
8.6%
pr659
 
1.0%

Most occurring characters

ValueCountFrequency (%)
E64621
49.5%
A58992
45.2%
S5629
 
4.3%
P659
 
0.5%
R659
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter130560
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E64621
49.5%
A58992
45.2%
S5629
 
4.3%
P659
 
0.5%
R659
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Latin130560
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E64621
49.5%
A58992
45.2%
S5629
 
4.3%
P659
 
0.5%
R659
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII130560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E64621
49.5%
A58992
45.2%
S5629
 
4.3%
P659
 
0.5%
R659
 
0.5%

List Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1062
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 514.7091493
Minimum: 0
Maximum: 2760.7
Zeros: 294
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 36.69
Q1: 181.56
Median: 325.19
Q3: 803.86
95-th percentile: 1431.23
Maximum: 2760.7
Range: 2760.7
Interquartile range (IQR): 622.3

Descriptive statistics

Standard deviation: 449.1870286
Coefficient of variation (CV): 0.8727006878
Kurtosis: 0.01247467261
Mean: 514.7091493
Median Absolute Deviation (MAD): 217.35
Skewness: 1.0054526
Sum: 33600213.26
Variance: 201768.9867
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2981508
 
2.3%
1431.231426
 
2.2%
966.441192
 
1.8%
1275.11126
 
1.7%
192.341041
 
1.6%
1627.841035
 
1.6%
157.76988
 
1.5%
1084.61975
 
1.5%
181.44893
 
1.4%
412.03892
 
1.4%
Other values (1052)54204
83.0%
ValueCountFrequency (%)
0294
0.5%
0.3929150
0.2%
0.421
 
< 0.1%
0.40525
 
< 0.1%
0.4110
 
< 0.1%
0.4456
 
< 0.1%
0.521
 
< 0.1%
0.614
 
< 0.1%
1.62362
 
< 0.1%
1.87119
 
< 0.1%
ValueCountFrequency (%)
2760.712
 
< 0.1%
2291.47
 
< 0.1%
226710
 
< 0.1%
1975113
0.2%
192061
0.1%
188019
 
< 0.1%
1759.445
 
0.1%
1731.435
 
0.1%
1691.412
 
< 0.1%
1688.13150
0.2%

Sales Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 14788
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 283.6968508
Minimum: 0.3373411765
Maximum: 6035
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 510.1 KiB

Quantile statistics

Minimum: 0.3373411765
5-th percentile: 22.42555556
Q1: 100.07
Median: 183.75825
Q3: 448.22
95-th percentile: 789.66725
Maximum: 6035
Range: 6034.662659
Interquartile range (IQR): 348.15

Descriptive statistics

Standard deviation: 252.0316598
Coefficient of variation (CV): 0.8883837061
Kurtosis: 6.882532688
Mean: 283.6968508
Median Absolute Deviation (MAD): 116.47175
Skewness: 1.418975272
Sum: 18519730.42
Variance: 63519.95752
Monotonicity: Not monotonic
2022-10-06T08:50:24.987909image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
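
"Fixed size bins" means the interval from the minimum to the maximum is split into equal-width bins and each value is counted in the bin it falls into. A minimal sketch of that idea follows; the function name `fixed_width_histogram` and the example prices are hypothetical, and the report's own plot is produced by Matplotlib, whose edge handling may differ in detail.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, so use a single occupied bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the last edge, so clamp it into the final bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical prices, not taken from the dataset.
prices = [0.5, 1.0, 2.5, 2.5, 4.0, 9.5, 10.0]
print(fixed_width_histogram(prices, bins=5))
```

With a long right tail like the one reported for this variable (maximum 6035 against a 95th percentile near 790), equal-width bins leave most of the upper bins nearly empty.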
Value | Count | Frequency (%)
140.43 | 191 | 0.3%
817.68 | 189 | 0.3%
133.41 | 181 | 0.3%
783.17 | 138 | 0.2%
824.39 | 138 | 0.2%
23.47 | 136 | 0.2%
82.87333333 | 125 | 0.2%
221.04 | 120 | 0.2%
230.25 | 120 | 0.2%
230.98 | 120 | 0.2%
Other values (14778) | 63822 | 97.8%
ValueCountFrequency (%)
0.33734117652
 
< 0.1%
0.35141
 
< 0.1%
0.36194117651
 
< 0.1%
0.3771867
0.1%
0.3849
 
< 0.1%
0.388812
 
< 0.1%
0.392967
0.1%
0.39365
 
< 0.1%
0.49
 
< 0.1%
0.4046916
 
< 0.1%
ValueCountFrequency (%)
60351
< 0.1%
37482
< 0.1%
3233.361
< 0.1%
3009.861
< 0.1%
3003.411
< 0.1%
28231
< 0.1%
2753.321
< 0.1%
25601
< 0.1%
2540.171
< 0.1%
2360.11
< 0.1%

Interactions

[Figures: 169 interaction plots, one for each ordered pair of the 13 numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
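
The calculation described above can be sketched in plain Python: rank both variables (tied values share the mean of their positions), then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the example data are hypothetical; this is an illustration, not the report generator's own code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

For this example ρ is 1 (up to floating-point rounding) because the ordering of y matches that of x exactly, while r on the raw values is lower because the relationship is curved.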

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
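
As a rough illustration of the formula and of the invariance property mentioned above, the sketch below computes r directly and then recomputes it after shifting and rescaling each variable separately. The function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```

The two printed values agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r but leave its magnitude unchanged.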

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
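
The pair-counting definition above translates almost directly into code. The sketch below implements exactly that formula (often called tau-a); the function name and data are hypothetical. Statistical libraries commonly report the tau-b variant instead, which adjusts the denominator for ties, so results on tied data may differ from this sketch.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
print(kendall_tau(x, y))  # (7 - 3) / 10 = 0.4
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.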

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators of Cramér's V have been shown to be biased, even for large samples, so the bias-corrected measure proposed by Bergsma in 2013 is used here.
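
A minimal sketch of a bias-corrected Cramér's V follows, using the correction terms as they are commonly stated for Bergsma's estimator. It is an illustration under that assumption, not the report generator's own implementation; the function name and the category labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of labels."""
    n = len(a)
    rows, cols = sorted(set(a)), sorted(set(b))
    r, k = len(rows), len(cols)
    if n < 2 or r < 2 or k < 2:
        # Association is undefined for a constant variable; this sketch
        # returns 0.0 in that case (an assumption, not a standard).
        return 0.0
    observed = Counter(zip(a, b))
    row_tot, col_tot = Counter(a), Counter(b)
    # Pearson chi-square statistic of the contingency table.
    chi2 = 0.0
    for ri in rows:
        for ci in cols:
            expected = row_tot[ri] * col_tot[ci] / n
            chi2 += (observed[(ri, ci)] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction applied to phi^2 and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical labels: `same` repeats `left`, `other` is independent of it.
left = ["x", "x", "y", "y"] * 25
same = list(left)
other = ["p", "q", "p", "q"] * 25
print(cramers_v_corrected(left, same), cramers_v_corrected(left, other))
```

The first call shows perfect association (a value of 1 up to rounding) and the second shows independence (0.0).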

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

(index) | df_index | CustKey | Item | Invoice Date | Invoice_Year | Invoice_Quarter | Invoice_Month | Invoice_Day | Sales Quantity | Sales Amount | Sales Amount Based on List Price | Discount Amount | Sales Margin Amount | Sales Cost Amount | Sales Rep | U/M | List Price | Sales Price
0 | 0 | 10000481 | Urban Large Eggs | 2017-04-30 | 2017 | 2 | 4 | 30 | 1 | 237.91 | 0.000 | -237.910 | 237.91 | 0.0 | 184 | EA | 0.000 | 237.910000
1 | 1 | 10002220 | Moms Sliced Turkey | 2017-07-14 | 2017 | 3 | 7 | 14 | 1 | 456.17 | 824.960 | 368.790 | 456.17 | 0.0 | 127 | EA | 824.960 | 456.170000
2 | 2 | 10002220 | Cutting Edge Foot-Long Hot Dogs | 2017-10-17 | 2017 | 4 | 10 | 17 | 1 | 438.93 | 548.660 | 109.730 | 438.93 | 0.0 | 127 | EA | 548.660 | 438.930000
3 | 3 | 10002489 | Kiwi Lox | 2017-06-03 | 2017 | 2 | 6 | 3 | 1 | 211.75 | 0.000 | -211.750 | 211.75 | 0.0 | 160 | EA | 0.000 | 211.750000
4 | 4 | 10004516 | High Top Sweet Onion | 2017-05-27 | 2017 | 2 | 5 | 27 | 455 | 89248.66 | 185876.600 | 96627.940 | 89248.66 | 0.0 | 124 | SE | 408.520 | 196.150901
5 | 5 | 10004516 | Best Choice Fudge Brownies | 2017-05-30 | 2017 | 2 | 5 | 30 | 1 | 1950.00 | 0.000 | -1950.000 | 1950.00 | 0.0 | 124 | EA | 0.000 | 1950.000000
6 | 6 | 10007866 | Moms Sliced Turkey | 2017-09-03 | 2017 | 3 | 9 | 3 | 1 | 424.30 | 795.314 | 371.014 | 424.30 | 0.0 | 149 | EA | 795.314 | 424.300000
7 | 7 | 10009356 | Tell Tale Garlic | 2017-06-18 | 2017 | 2 | 6 | 18 | 2 | 541.92 | 1150.000 | 608.080 | 541.92 | 0.0 | 103 | EA | 575.000 | 270.960000
8 | 8 | 10009356 | High Top Walnuts | 2017-06-18 | 2017 | 2 | 6 | 18 | 15 | 353.40 | 778.200 | 424.800 | 353.40 | 0.0 | 103 | EA | 51.880 | 23.560000
9 | 9 | 10009356 | Big Time Frozen Cheese Pizza | 2017-06-18 | 2017 | 2 | 6 | 18 | 60 | 11229.00 | 24721.800 | 13492.800 | 11229.00 | 0.0 | 103 | EA | 412.030 | 187.150000

Last rows

(index) | df_index | CustKey | Item | Invoice Date | Invoice_Year | Invoice_Quarter | Invoice_Month | Invoice_Day | Sales Quantity | Sales Amount | Sales Amount Based on List Price | Discount Amount | Sales Margin Amount | Sales Cost Amount | Sales Rep | U/M | List Price | Sales Price
65270 | 65272 | 10017638 | Blue Label Canned Beets | 2018-03-21 | 2018 | 1 | 3 | 21 | 2 | 671.90 | 1268.22 | 596.32 | 127.79 | 544.11 | 180 | EA | 634.11 | 335.950000
65271 | 65273 | 10017638 | Moms Sliced Turkey | 2018-03-21 | 2018 | 1 | 3 | 21 | 12 | 5244.76 | 9899.52 | 4654.76 | 2186.28 | 3058.48 | 180 | EA | 824.96 | 437.063333
65272 | 65274 | 10017638 | Gorilla Strawberry Yogurt | 2018-03-21 | 2018 | 1 | 3 | 21 | 18 | 1783.40 | 3366.18 | 1582.78 | 1015.17 | 768.23 | 180 | EA | 187.01 | 99.077778
65273 | 65275 | 10017638 | Gorilla Jack Cheese | 2018-03-21 | 2018 | 1 | 3 | 21 | 2 | 1110.30 | 2206.00 | 1095.70 | 265.75 | 844.55 | 180 | EA | 1103.00 | 555.150000
65274 | 65276 | 10017638 | Blue Label Fancy Canned Oysters | 2018-03-21 | 2018 | 1 | 3 | 21 | 40 | 312.79 | 590.40 | 277.61 | 44.39 | 268.40 | 180 | EA | 14.76 | 7.819750
65275 | 65277 | 10017638 | High Top Oranges | 2018-03-21 | 2018 | 1 | 3 | 21 | 9 | 569.90 | 1075.68 | 505.78 | 329.95 | 239.95 | 180 | EA | 119.52 | 63.322222
65276 | 65278 | 10017638 | Landslide White Sugar | 2018-03-21 | 2018 | 1 | 3 | 21 | 2 | 462.81 | 873.56 | 410.75 | 39.26 | 423.55 | 180 | EA | 436.78 | 231.405000
65277 | 65279 | 10017638 | Moms Potato Salad | 2018-03-21 | 2018 | 1 | 3 | 21 | 8 | 987.20 | 1863.36 | 876.16 | 413.20 | 574.00 | 180 | EA | 232.92 | 123.400000
65278 | 65280 | 10017638 | Better Fancy Canned Sardines | 2018-03-21 | 2018 | 1 | 3 | 21 | 36 | 27297.51 | 51524.28 | 24226.77 | 11108.61 | 16188.90 | 180 | EA | 1431.23 | 758.264167
65279 | 65281 | 10017638 | Imagine Popsicles | 2018-03-21 | 2018 | 1 | 3 | 21 | 48 | 27582.02 | 52061.28 | 24479.26 | 13347.80 | 14234.22 | 180 | EA | 1084.61 | 574.625417